Improving Successor Variety for Morphological Segmentation

نویسنده

  • Çağrı Çöltekin
چکیده

Successor variety is a commonly used measure for segmentation in language processing. It is based on a simple idea that large variety of letters (or phonemes) following an initial word (or utterance) segment indicates a possible boundary. It dates back to Harris (1955), and several methods based on successor variety have been used in the literature, particularly for the purposes of segmenting words into morphemes. However, there have not been many studies analyzing the measure itself. Even though the idea is simple and effective, the current use in the literature does not utilize the measure to its full extent due to a number of problems with the successor variety scores. This paper intends to address these problems by introducing a normalization method, and demonstrates—using segmentation experiments on two typologically different languages— the effectiveness of this improvement on the morphological segmentation task.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Trie-Structured Bayesian Model for Unsupervised Morphological Segmentation

In this paper, we introduce a trie-structured Bayesian model for unsupervised morphological segmentation. We adopt prior information from different sources in the model. We use neural word embeddings to discover words that are morphologically derived from each other and thereby that are semantically similar. We use letter successor variety counts obtained from tries that are built by neural wor...

متن کامل

Improving edge detection and watershed segmentation with anisotropic diffusion and morphological levellings

Edge preserving smoothing and image simplification is of fundamental importance in a variety of remote sensing applications during feature extraction and object detection procedures. The construction of a pre-processing filtering tool for edge detection and segmentation tasks is still an open matter. Towards this end, this paper brings together two advanced nonlinear scale space representations...

متن کامل

Improving Brain Magnetic Resonance Image (MRI) Segmentation via a Novel Algorithm based on Genetic and Regional Growth

Background: Regarding the importance of right diagnosis in medical applications, various methods have been exploited for processing medical images solar. The method of segmentation is used to analyze anal to miscall structures in medical imaging.Objective: This study describes a new method for brain Magnetic Resonance Image (MRI) segmentation via a novel algorithm based on genetic and regiona...

متن کامل

Robust Potato Color Image Segmentation using Adaptive Fuzzy Inference System

Potato image segmentation is an important part of image-based potato defect detection. This paper presents a robust potato color image segmentation through a combination of a fuzzy rule based system, an image thresholding based on Genetic Algorithm (GA) optimization and morphological operators. The proposed potato color image segmentation is robust against variation of background, distance and ...

متن کامل

Phonetic segmentation using multiple speech features

In this paper we propose a method for improving the performance of the segmentation of speech waveforms to phonetic segments. The proposed method is based on the well known Viterbi timealignment algorithm and utilizes the phonetic boundary predictions from multiple speech parameterization techniques. Specifically, we utilize the best, with respect to boundary type, phone transition position pre...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2010